Sequencing protocols, best practice variant calling and filtering
Per Unneberg
NBIS
11/22/22
Data generation
Population genomics - the data
Since the goal of population genomics is to analyze variation in a set of individuals, data generation consists of compiling variation data from individuals. Here the focus is on next-generation sequencing data.
1%-10% for some analyses (PCA/admixture/LD/\(\mathsf{F_{ST}}\))
Restricting analysis to a predefined site list
List of global SNPs
Use global call set for analyses requiring shared sites
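As a minimal sketch of restricting analysis to a predefined site list, the Python below keeps only those calls whose position appears in a global SNP list. The names `Site`, `load_site_list`, and `restrict_to_sites`, the two-column site-list layout (chromosome, 1-based position), and the toy calls are assumptions made for illustration and do not come from any particular tool.

```python
from typing import Iterable, List, NamedTuple, Set, Tuple


class Site(NamedTuple):
    """A genomic site (hypothetical helper type for this sketch)."""
    chrom: str
    pos: int  # 1-based position


def load_site_list(lines: Iterable[str]) -> Set[Site]:
    """Parse whitespace-separated 'chrom pos' lines into a set of sites.

    Blank lines and lines starting with '#' are skipped.
    """
    sites: Set[Site] = set()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        chrom, pos = line.split()[:2]
        sites.add(Site(chrom, int(pos)))
    return sites


def restrict_to_sites(calls: Iterable[Tuple], sites: Set[Site]) -> List[Tuple]:
    """Keep only calls whose (chrom, pos) is in the predefined site list.

    Each call is assumed to be a tuple starting with (chrom, pos, ...).
    """
    return [call for call in calls if Site(call[0], call[1]) in sites]


# Toy global SNP list (assumed layout: chromosome, position).
global_snps = load_site_list(["# chrom pos", "chr1\t100", "chr1\t250", "chr2\t40"])

# Toy per-population calls: (chrom, pos, ref, alt).
pop_a_calls = [
    ("chr1", 100, "A", "G"),
    ("chr1", 180, "C", "T"),  # not in the global list, so it is dropped
    ("chr2", 40, "G", "A"),
]

shared = restrict_to_sites(pop_a_calls, global_snps)
print(shared)
```

Applying the same site list to every population means that downstream analyses compare populations over an identical set of positions, which is the point of using a global call set for analyses that require shared sites.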
Refs
Lou, R. N., Jacobs, A., Wilder, A. P., & Therkildsen, N. O. (2021). A beginner’s guide to low-coverage whole genome sequencing for population genomics. Molecular Ecology, 30(23), 5966–5993. https://doi.org/10.1111/mec.16077
Talla, V., Soler, L., Kawakami, T., Dincă, V., Vila, R., Friberg, M., Wiklund, C., & Backström, N. (2019). Dissecting the Effects of Selection and Mutation on Genetic Diversity in Three Wood White (Leptidea) Butterfly Species. Genome Biology and Evolution, 11(10), 2875–2886. https://doi.org/10.1093/gbe/evz212